Personalization and Clustering of Similar Web Pages

Authors

  • Smita Gupta
  • Anurag Malik
  • Ajay Ohri
  • Andrei Broder
  • Bamshad Mobasher
  • Christian Ricci
  • Michael Dittenbach
Abstract

Over the last decade, the clichéd "information age" has truly arrived. The evolution of the Internet into the Global Information Infrastructure, together with the massive popularity of the Web, has enabled ordinary citizens to become not just consumers of information but also a part of it. To spare users trouble, their time and effort must be saved, so a way is needed to deliver relevant information quickly and to manage large amounts of data without difficulty. In this paper, the authors use the tf-idf (term frequency-inverse document frequency) technique together with web mining to achieve this. Web mining is the application of data mining techniques to discover patterns from the Web. Of its branches (Web usage mining, Web content mining, and Web structure mining), this work addresses only Web content mining. A Windows application is developed that acts as a data analysis tool and uses the Bing search engine API. The proposed algorithm is applied to the snippets (the short descriptions shown below each search result) of web search results to find the web pages that contain the largest number of query words. It also aims to make information easier to manage on the client's machine by using a simple grouping technique.
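
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how search-result snippets might be ranked against a query with tf-idf. The function names (`tokenize`, `rank_snippets`), the smoothed idf variant, and the sample snippets are all assumptions made for illustration; they are not taken from the paper, and the Bing API call and the grouping step are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_snippets(query, snippets):
    """Return (score, index) pairs, highest tf-idf score first.

    Hypothetical helper: sums tf * idf over the distinct query terms
    that occur in each snippet.
    """
    docs = [tokenize(s) for s in snippets]
    n_docs = len(docs)
    # Document frequency: number of snippets that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    query_terms = set(tokenize(query))

    scored = []
    for index, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # term absent from this snippet
            # Smoothed idf (an assumed variant) keeps every weight positive.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            score += (tf[term] / len(doc)) * idf
        scored.append((score, index))
    # Sort by descending score; ties keep the original result order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Made-up snippets standing in for search-engine results.
    results = [
        "Web mining applies data mining techniques to the Web",
        "Cooking recipes for quick dinners",
        "Web content mining extracts patterns from web pages",
    ]
    for score, index in rank_snippets("web content mining", results):
        print(f"{score:.3f}  {results[index]}")
```

Snippets that contain more of the query words, and rarer ones, receive higher scores, which matches the stated goal of finding the pages with the largest number of query words. Normalizing term frequency by snippet length is a design choice of this sketch, not something the abstract specifies.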

Similar resources

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

One of the most famous methods for recommendation is user-based Collaborative Filtering (CF). This method compares the active user's item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...

Finding Community Base on Web Graph Clustering

Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities across different fields, most answers to users' questions aren't correct. So web graph clustering and placing clusters in the corresponding answers help users achieve their intended results. Community (web communit...

An Heighten PSO-K-harmonic Mean Based Pattern Recognition in User Navigation

Website navigation patterns can be searched and analyzed with the introduction of the new methodology. The user navigation path is stored as a sequence of URL categories on the web server. The approach is to separate the users and sessions from the web log files and acquire the necessary patterns for web personalization. The clustering concept is used for grouping the necessary pa...

Matrix Based Fuzzy Clustering for Categorization of Web Users and Web Pages

Categorization of Web users and Web pages is a fundamental task of Web personalization. This paper proposes a Matrix Based Fuzzy Clustering Approach (MBFCA) and experimentally evaluates it for the effective discovery of web user clusters and web page clusters. The use of MBFCA enables the generation of clusters that can capture the Web user’s navigation behavior based on th...

An Application of Session Based Clustering to Analyze Web Pages of User Interest from Web Log Files

Problem statement: With the continued growth and proliferation of e-commerce, Web services and Web-based information systems, the volumes of click-stream and user data collected by Web-based organizations in their daily operations have reached astronomical proportions. Analyzing such data can help these organizations optimize the functionality of web-based applications and provide more personal...

A Technique for Improving Web Mining using Enhanced Genetic Algorithm

The World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines use conventional methods to retrieve information on the Web; however, their search results can still be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms which search according to the user interests...

Publication date: 2012